Links

Is the 1000 Genomes data available in genome browsers?

Answer:

1000 Genomes Project data is available at both Ensembl and the UCSC Genome Browser.

More information on accessing 1000 Genomes Project data in genome browsers can be found on the Browser page.

Ensembl provides consequence information for the variants. The variants that are loaded into the Ensembl database and have consequence types assigned are displayed on the Variation view. Ensembl can also offer consequence predictions using their Variant Effect Predictor (VEP).

You can see individual genotype information in the Ensembl browser by opening the Individual Genotypes section of a variant page from the menu on the left-hand side.

Related questions:

Are all the variants displayed on the 1000 Genomes Project Browser discovered by the project?

Answer:

No, not all the variants in the browsers produced by the 1000 Genomes Project were discovered by the 1000 Genomes Project.

The data from the 1000 Genomes Project is available in a number of browsers, including browsers produced by the 1000 Genomes Project, which reflect the major data releases associated with the pilot, phase 1 and phase 3 publications from the 1000 Genomes Project. More information on this is available on the browsers page.

The 1000 Genomes Project Browsers, maintained during the 1000 Genomes Project, are based on custom versions of the Ensembl browser. Their databases contain the Ensembl core features (genes and transcripts), regulatory elements from the Ensembl Regulatory Build and variation data from the Ensembl Variation database.

As well as 1000 Genomes Project variation data, Ensembl Variation contains data from dbSNP, ClinVar, COSMIC, dbGaP, dbVar, EGA and many other sources.

Related questions:

Can I access the databases associated with the 1000 Genomes browser?

Answer:

We provide a public MySQL instance with copies of the databases behind the 1000 Genomes Project Ensembl browser. These databases are described on our public instance page. More information about the browsers and their history can be found on the browsers page.

Related questions:

Can I BLAST against the 1000 Genomes data sets?

Answer:

The 1000 Genomes raw sequence data represents more than 30,000x coverage of the human genome and there are no tools currently available to search against the complete data set. You can, however, use the Ensembl or NCBI BLAST services and then use the results to find 1000 Genomes Project variants in dbSNP.

Related questions:

Can I get individual genotype information from browser.1000genomes.org?

Answer:

The 1000 Genomes browser at browser.1000genomes.org is no longer maintained, but all data is accessible from the Ensembl browser at grch37.ensembl.org. You can see individual genotype information in the browser by looking at the Sample Genotypes section of a variant page, which can be reached from the menu on the left-hand side of the page. You can find a particular variant by putting its rs number in the search box at the top right-hand corner of every browser page.
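The same per-sample genotypes can usually be retrieved programmatically. The sketch below is not part of the original answer: it assumes the Ensembl REST service for the GRCh37 archive at grch37.rest.ensembl.org and its `variation` endpoint with `genotypes=1`, and it assumes each returned genotype entry carries `sample` and `genotype` fields. The helper names are hypothetical.

```python
import json
import urllib.request
from collections import Counter

# Assumption: REST server that mirrors the grch37.ensembl.org browser.
GRCH37_REST = "https://grch37.rest.ensembl.org"


def genotype_url(rsid, server=GRCH37_REST):
    """Build the URL that requests a variant together with its sample genotypes."""
    return f"{server}/variation/human/{rsid}?genotypes=1"


def fetch_variant(rsid, server=GRCH37_REST):
    """Download the variant record as JSON (needs network access)."""
    request = urllib.request.Request(
        genotype_url(rsid, server),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def count_genotypes(record, sample_prefix="1000GENOMES"):
    """Count genotypes among samples whose name starts with sample_prefix.

    Assumes record["genotypes"] is a list of dicts with "sample" and
    "genotype" keys; a record without genotypes yields an empty Counter.
    """
    counts = Counter()
    for entry in record.get("genotypes", []):
        if entry.get("sample", "").startswith(sample_prefix):
            counts[entry["genotype"]] += 1
    return counts
```

Calling `count_genotypes(fetch_variant(rsid))` with an rs number of interest would then give a tally of genotypes across the 1000 Genomes samples, provided the service responds in the assumed shape.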

Related questions:

Where can I get consequence annotations for the 1000 Genomes variants?

Answer:

The final 1000 Genomes phase 3 analysis calculated consequences based on GENCODE annotation and this can be found in the directory: release/20130502/supporting/functional_annotation/

Ensembl also provides consequence information for the variants. The variants that are loaded into the Ensembl database and have consequence types assigned are displayed on the Variation view. Ensembl can also offer consequence predictions using their Variant Effect Predictor (VEP).

Please note that the phase 3 annotations and the Ensembl annotations visible via the browser may differ because they use different versions of the gene and non-coding annotation.
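As an illustration of the VEP route, the sketch below builds a request for the Ensembl REST `vep/human/id` endpoint and condenses its JSON reply. It is a minimal example under assumptions: the server URL, the helper names and the reliance on the `most_severe_consequence`, `transcript_consequences`, `transcript_id` and `consequence_terms` fields are not stated in this FAQ, and the JSON must be fetched separately with an HTTP client that sends `Content-Type: application/json`.

```python
# Assumption: current Ensembl REST server (GRCh38); the GRCh37 archive has its own host.
ENSEMBL_REST = "https://rest.ensembl.org"


def vep_url(rsid, server=ENSEMBL_REST):
    """Build the VEP REST URL for a known variant identifier."""
    return f"{server}/vep/human/id/{rsid}"


def summarise_vep(results):
    """Reduce a VEP JSON reply (a list of result dicts) to the key fields.

    For each input variant, keep the most severe consequence and the
    sorted consequence terms reported for every affected transcript.
    """
    summary = []
    for result in results:
        per_transcript = {}
        for consequence in result.get("transcript_consequences", []):
            terms = sorted(consequence.get("consequence_terms", []))
            per_transcript[consequence["transcript_id"]] = terms
        summary.append(
            {
                "input": result.get("input"),
                "most_severe": result.get("most_severe_consequence"),
                "per_transcript": per_transcript,
            }
        )
    return summary
```

Because VEP uses whichever gene set the queried Ensembl release carries, consequences obtained this way can differ from those in the phase 3 functional annotation files.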

Related questions:

Where does the Ancestral Allele Information for your variants come from?

Answer:

The ancestral alleles associated with the phase 1 release were generated using two different processes.

The SNP ancestral alleles were derived from Ensembl Compara release 59. The alignments used to generate them can be found in the phase1/supporting directory.

The indel ancestral alleles were generated using a separate process.

The deletions should not have any ancestral alleles.

Related questions:

Why are the coordinates of your pilot variants different to what is displayed in Ensembl or UCSC?

Answer:

The pilot data for the 1000 Genomes Project was all mapped to the NCBI36/hg18 build of the human assembly. When the data was loaded into dbSNP it was mapped to GRCh37/hg19, which is accessible from both Ensembl and UCSC. This means that the coordinates of the pilot data on the 1000 Genomes FTP site differ from the coordinates presented in Ensembl and UCSC.

You can also view 1000 Genomes variants mapped to GRCh38 on Ensembl and UCSC.
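To compare a pilot coordinate with what a browser shows, the region can be converted between assemblies. The sketch below assumes the Ensembl REST `map` endpoint and assumes that it accepts `NCBI36` as a source assembly on the chosen server; neither is stated in this FAQ. The helper names are hypothetical, and the reply is assumed to hold a `mappings` list whose items have a `mapped` block with `seq_region_name`, `start`, `end` and `strand`.

```python
# Assumption: Ensembl REST server; availability of NCBI36 mappings is not guaranteed here.
ENSEMBL_REST = "https://rest.ensembl.org"


def map_url(chrom, start, end, source="NCBI36", target="GRCh37", server=ENSEMBL_REST):
    """Build the URL that asks for a forward-strand region to be converted between assemblies."""
    return f"{server}/map/human/{source}/{chrom}:{start}..{end}:1/{target}"


def mapped_regions(payload):
    """Extract (chromosome, start, end, strand) tuples from a map reply.

    A region can map to several target blocks, or to none at all, so the
    result is a list that may be empty.
    """
    regions = []
    for mapping in payload.get("mappings", []):
        mapped = mapping["mapped"]
        regions.append(
            (mapped["seq_region_name"], mapped["start"], mapped["end"], mapped["strand"])
        )
    return regions
```

An empty list from `mapped_regions` would indicate that the region has no equivalent in the target assembly, which is worth checking before concluding that a pilot variant is missing from a browser.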

Related questions:

Why isn't my SNP in browser.1000genomes.org?

Answer:

Ensembl and the UCSC Genome Browser both import their variant data from dbSNP. When new 1000 Genomes variants are released, it can take some time for them to be accessioned by dbSNP and make their way to the browsers.

When this happens we try to ensure there is a version of our own browser which displays the data in the meantime. Both Ensembl and UCSC also support attaching VCF files for visualisation.

Related questions: